The surface code provides extremely realistic physical targets for the implementation of a
large-scale quantum computer, requiring only a 2-D lattice of qubits with nearest neighbor interactions
and error rates below approximately 1%. Current surface code schemes are, however, rather slow 
when it comes to computation. We show how to perform surface code quantum computation in a 
time-optimal manner, as fast as any fault-tolerant quantum computation scheme is believed to be 
able to run. 



Quantum computers promise efficient solution of many 
problems that are intractable on a classical computer, 
from factoring [1], to simulating quantum physics [2], to
problems in knot theory [3]. A comprehensive catalogue
of known quantum algorithms can be found at [4].

Quantum systems are fragile, and research into efficient
quantum error correction schemes is ongoing. The
earliest schemes [5-7] and some modern schemes [8-10]
suppress errors best when it is assumed that qubits separated
by arbitrary distances can interact in constant
time with constant error rate. By contrast, surface code
schemes [11-15] suppress errors very well given only a
2-D square lattice of qubits with nearest neighbor interactions.
Threshold error rates of approximately 1% have
been found in simulations [16, 17], with proven exponential
suppression of error with increasing lattice size
provided all quantum gates have error rates below the
threshold [18]. Furthermore, there are strong theoretical
reasons to believe the surface code is the lowest overhead
code that will ever exist for a 2-D nearest neighbor
architecture [19].

Error corrected quantum computation involves a se- 
quence of unitary operations and measurements. All 
known forms of universal error corrected quantum com- 
putation use intermediate measurement results to deter- 
mine future unitary operations. It is strongly believed 
that this is unavoidable [20, 21].

In the surface code, logical T and T† gates involve a
measurement dependent future S gate (fig. 1). Since the
same circuit probabilistically implements either T or T†,
discussion shall henceforth primarily refer to the T gate.
Given a general qubit state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, T maps
$|\psi\rangle$ to $\alpha|0\rangle + \beta e^{i\pi/4}|1\rangle$. S maps $|\psi\rangle$ to $\alpha|0\rangle + i\beta|1\rangle$.
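
The definitions above can be checked numerically. The following is a minimal sketch, not the circuit of fig. 1 itself: it assumes the data qubit controls a CNOT onto an ancilla prepared in (|0> + e^{i pi/4}|1>)/sqrt(2), which is then measured in the Z basis. The function names and the pair-of-amplitudes state representation are illustrative choices.

```python
import cmath
import math

# Assumed conventions (from the definitions in the text):
# T = diag(1, e^{i*pi/4}), S = diag(1, i); a state is a pair (alpha, beta).
OMEGA = cmath.exp(1j * math.pi / 4)


def apply_t(state):
    alpha, beta = state
    return (alpha, beta * OMEGA)


def apply_t_dagger(state):
    alpha, beta = state
    return (alpha, beta * OMEGA.conjugate())


def apply_s(state):
    alpha, beta = state
    return (alpha, beta * 1j)


def probabilistic_t(state, outcome):
    """Hypothetical model of a measurement-based T gate.

    Assumption: the data qubit controls a CNOT onto an ancilla prepared in
    (|0> + e^{i*pi/4}|1>)/sqrt(2); the ancilla is then measured in the Z
    basis, giving `outcome` (0 or 1).
    Returns (probability of this outcome, normalized post-measurement state).
    """
    ancilla = (1 / math.sqrt(2), OMEGA / math.sqrt(2))
    # After the CNOT, the amplitude of |data=d, ancilla=a> is
    # state[d] * ancilla[a XOR d]; keep only the measured ancilla value.
    kept = [state[d] * ancilla[outcome ^ d] for d in (0, 1)]
    prob = sum(abs(x) ** 2 for x in kept)
    norm = math.sqrt(prob)
    return prob, (kept[0] / norm, kept[1] / norm)


def same_up_to_phase(u, v, tol=1e-12):
    """True if two normalized states differ only by a global phase."""
    overlap = sum(a.conjugate() * b for a, b in zip(u, v))
    return abs(abs(overlap) - 1.0) < tol


psi = (0.6, 0.8j)                    # arbitrary normalized test state
p0, out0 = probabilistic_t(psi, 0)   # outcome 0 branch
p1, out1 = probabilistic_t(psi, 1)   # outcome 1 branch
corrected = apply_s(out1)            # S correction applied to the T-dagger branch
```

Under these assumptions both outcomes occur with probability 1/2, `out0` matches `apply_t(psi)`, `out1` matches `apply_t_dagger(psi)`, and `corrected` again matches `apply_t(psi)`, each up to a global phase. This is why a measurement dependent S gate suffices to turn the probabilistic circuit into a deterministic T gate.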

Surface code quantum computation is achieved by 
switching off regions of error detection in a careful manner
[15]. The details of how and why are not important
for the purposes of this paper. A space-time region in 
which error detection has been turned off is called a de- 
fect. A defect that permits chains of Z (X) errors to end 
undetectably on its surface is called a primal (dual) de- 
fect. The minimum number of errors required to encircle 
any defect or topologically nontrivially connect defects 
is called the distance of that surface code. The distance
d measures the strength of the error correction provided
errors are random and independent, since long chains of
errors are then exponentially unlikely.

FIG. 1: Probabilistic circuit used to implement logical T and
T†. Mz represents Z basis measurement; the connected dot
and target symbol is a controlled-NOT (CNOT) gate that
inverts the target if the control qubit is $|1\rangle$. Time runs left
to right. Given an input state $|A\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$, T
is implemented if Mz = 0 (+1 eigenstate of Z) and T† if
Mz = 1 (-1 eigenstate of Z). Both measurement results are
equally likely. If the measurement result indicates the desired
gate did not occur, an S gate is applied using the circuit of
fig. 2 [22, 23]. See text for definitions of T and S.

FIG. 2: Quantum circuit using a state $|Y\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$ to
implement an S gate. The H represents Hadamard ($H|0\rangle = |+\rangle$,
$H|1\rangle = |-\rangle$). The procedure for performing H within
the surface code is described in [24].

Canonical patterns of defects implementing a logical 
T gate and its associated feedforward measurement controlled
S gate are shown in fig. 3. Given square cross-section
defects of circumference d and defects separated
by d, the complete pattern has a time depth of 45d/4,
measured in rounds of error detection. A single round of 
error detection takes as long as the Z error detection circuit
of fig. 4a (the X error detection circuit is shorter and
all circuits are implemented in parallel). The complete T
gate shown in fig. 3 would therefore take significant time
to implement.

By contrast, all other surface code gates, CNOT, H, 
S, initialization and measurement in the logical X and Z 
bases, do not require feedforward measurement and can 
therefore effectively be implemented in no time. Figs. 4b-5
give an example of a CNOT and H circuit implemented
in effectively zero time: an arbitrary number of these circuits
can be chained together sequentially and executed
in constant time. Note that space-time volume is still
required. This perhaps surprising capability of the surface
code is a generic property of any quantum error correction
code performing a circuit involving only Clifford
group gates (the group of gates generated by CNOT, H
and S).

FIG. 3: Patterns of defects corresponding to a complete T
gate (figs. 1-2) when (a) Mz = -1, (b) Mz = +1. In the latter
case, we have d rounds of error detection to decide whether
or not to proceed with the corrective S gate. Note that the
incomplete section of dual defect is topologically disconnected
and hence does not affect the computation.

FIG. 4: a. Quantum circuit used to detect Z errors in the
surface code. This is the most complex and execution time
limiting part of surface code error detection. b. A quantum
circuit that at first glance requires significant time to implement.
Time runs left to right.

FIG. 5: An effectively zero time surface code implementation
of the quantum circuit of fig. 4b. Time runs vertically. An
arbitrary number of copies of this defect structure can be
chained together without taking additional time.

Motivated by the above, it makes sense to define the
T depth of a quantum circuit as the minimum number
of T gates that need to be implemented sequentially to
complete execution [25]. The quantum circuit THT has
proven minimum T depth 2 irrespective of how many ancilla
qubits are made available and how many additional
Clifford group gates are used [26]. The T depth, and the
amount of time required to implement a T gate, give the
minimum execution time of the quantum circuit. Finding
minimum T depth versions of quantum circuits is an
important and active area of research [27, 28].
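
To make the notion of T depth concrete, the sketch below counts the sequential T layers of a circuit exactly as written, treating every non-T gate as a free Clifford gate. The list-of-tuples circuit format and the gate names "T" and "Tdg" are hypothetical choices made for this example. The function does not search over equivalent circuits, so it only gives an upper bound on the minimum T depth discussed above.

```python
def t_depth(circuit):
    """T depth of a circuit as written.

    `circuit` is a list of (gate_name, qubits) pairs in execution order,
    a hypothetical format chosen for this sketch. Gates named "T" or "Tdg"
    cost one layer; every other gate is treated as a free Clifford gate.
    """
    depth = {}  # qubit -> number of sequential T layers seen so far
    for name, qubits in circuit:
        # A gate can start only after all of its qubits are available.
        layer = max(depth.get(q, 0) for q in qubits)
        if name in ("T", "Tdg"):
            layer += 1
        for q in qubits:
            depth[q] = layer
    return max(depth.values(), default=0)


tht = [("T", (0,)), ("H", (0,)), ("T", (0,))]
parallel = [("T", (0,)), ("T", (1,))]
chained = [("T", (0,)), ("CNOT", (0, 1)), ("T", (1,))]

print(t_depth(tht), t_depth(parallel), t_depth(chained))
```

Two T gates on different qubits share a layer, whereas a CNOT between them forces the second T gate into a new layer, because the multi-qubit gate propagates the dependency.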

In this work, assuming fast classical processing of polynomial
complexity functions [17, 29], we describe how to
perform a surface code T gate in the time required to
perform a single physical qubit measurement, which is
optimal for any quantum error correction code under the
assumption that feedforward is required at all. Given
a quantum computer with physical qubit measurement
time $t_m$, a quantum computation with T depth n would
asymptotically take just $n t_m$. Our technique increases
the speed of surface code quantum computation by two to
three orders of magnitude, depending on the problem size
and physical measurement time, without significantly increasing
the required space-time volume. Note, however,
that faster execution and fixed space-time volume do
imply the need for additional qubits.

The first step towards achieving time-optimal com- 
putation is achieving time-optimal logical measurement. 
Given two primal defects that collectively represent a sin- 
gle logical qubit, we can achieve this by using a flat cap 
defect as shown in figs. 6a-b. The data we wish to collect
is the chain of corrected measurement results shown in
fig. 6a. Error correction involves minimum weight perfect
matching [29-31], which is a local procedure provided the
error rate is below threshold. This means that by using 
a sufficiently large d, we can make it exponentially unlikely
that measurement data outside the underside of
the flat cap will be required to reliably correct the chain 
of measurement results we are interested in. A single layer
of measurements over a small patch of the surface code 
is thus sufficient to perform a time-optimal local logical 
measurement. 
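
The final classical step of such a logical measurement is simple once the decoder has run. The sketch below assumes a hypothetical interface in which the matching step has already produced the set of chain positions whose raw results must be flipped; minimum weight perfect matching itself is not implemented here.

```python
def logical_measurement(raw_results, flips):
    """Product of the corrected measurement results along the chain.

    raw_results: list of +1/-1 physical Z measurement results on the chain.
    flips: set of chain indices that the decoder (assumed to exist and not
           shown here) has identified as needing correction.
    """
    product = 1
    for index, result in enumerate(raw_results):
        corrected = -result if index in flips else result
        product *= corrected
    return product


raw = [+1, -1, +1, +1]
uncorrected = logical_measurement(raw, set())   # no corrections applied
corrected = logical_measurement(raw, {1})       # decoder flags position 1
```

With these illustrative inputs a single flagged position changes the sign of the logical result, which is why the correction must be reliable before the result is used for feedforward.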
FIG. 6: a. Lattice of qubits corresponding to the underside
of fig. 6b. These qubits will be measured in the Z basis to
start the table-shaped defect. The product of the corrected
highlighted vertical chain of measurement results will form
the logical measurement result. Paths of errors of length 2d
are shown that have the potential to corrupt the logical measurement
result. By choosing an appropriately large d, such
corruption can be made exponentially unlikely. b. Flat table-like
defect used to perform logical measurement in the time of
a single physical measurement, neglecting classical processing.
c. Quantum circuit enabling selective destination teleportation.
By measuring the first (second) qubit, data will be teleported
to the third (fourth) qubit. The defect pattern of this
circuit is shown in fig. 7.



The second step is achieving selective destination teleportation
of data. Fig. 6c shows how this can be achieved
using the surface code. If one measures, using a flat cap,
the first (second) defect pair, the data is teleported to the
third (fourth) defect pair. Fig. 7 shows that if the first
pair is measured, the second and fourth pair of defects 
become topologically disconnected and therefore do not 
influence the computation. 

Selective destination teleportation can be used to include
or exclude a corrective S gate as shown in fig. 8.
As soon as the first logical Z measurement is performed,
we know whether an S gate is required and can selectively
apply the measurement cap to the appropriate pair
of defects. Meanwhile, we can perform additional Clifford
circuitry and even the measurement associated with
another T gate. The measurement results of this subsequent
T gate, the selective teleportation, and indeed
all of the measurements associated with the topological
structures before these measurements, can then be used
to determine the next selective inclusion of an S gate.
This technique easily generalizes to the implementation 
of an arbitrary Clifford circuit and arbitrary number of 
parallel T gates in the effective time of a single physical 
measurement. 

FIG. 7: Defect pattern implementing the selective destination
teleportation of fig. 6c. Note that no further manipulation of
the second pair of defects will be performed, and hence the
fourth pair of defects becomes topologically disconnected and
therefore does not influence the computation.

FIG. 8: Time-optimal method of implementing a sequence
of T gates interleaved with Clifford circuitry. The first (top)
logical measurement, which takes the same amount of time as
a single physical measurement (figs. 6a-b), tells us whether
an S gate correction is required, which we can selectively
patch into the computation using teleportation (figs. 6c-7).
While this is being done, Clifford circuitry and the next T
gate can be executed. The logical measurement results associated
with the selective teleportation and the second T gate
are used to determine the selective teleportation required by
the second T gate. Each Clifford plus T layer of circuitry
thus can be executed in the time of a single physical measurement.
Little space-time volume overhead is associated
with achieving time-optimal execution; however, faster execution
and constant space-time volume imply the need for
additional qubits.

Overhead is dominated by the preparation of $|A\rangle$ states
for use in T gates. The volume of the entire structure
in fig. 7 is ~50 (count the number of small cubes in a
minimum size cuboid volume containing the structure).
This volume could be substantially reduced using bridge
compression techniques [19]. The volume of preparing a
single $|A\rangle$ state is either approximately 200 or 600 depending
on the fidelity required [19], so even making use
of the uncompressed structure of fig. 7, achieving time-optimal
quantum computation requires a worst-case 25%
space-time volume increase. For larger problems using
the larger volume state distillation structure and compressed
selective teleportation, this additional overhead
would drop below 5%.
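
The overhead figures quoted above follow from a simple ratio. The sketch below assumes one selective teleportation structure per $|A\rangle$ state and takes the overhead to be the teleportation volume divided by the state preparation volume; both the model and the function names are illustrative simplifications.

```python
def relative_overhead(teleport_volume, preparation_volume):
    """Extra space-time volume as a fraction of the |A> preparation volume."""
    return teleport_volume / preparation_volume


def max_teleport_volume(target_overhead, preparation_volume):
    """Largest teleportation volume that keeps the overhead at the target."""
    return target_overhead * preparation_volume


UNCOMPRESSED_TELEPORT = 50      # volume of the fig. 7 structure (from the text)
SMALL_PREP, LARGE_PREP = 200, 600   # |A> preparation volumes (from the text)

worst_case = relative_overhead(UNCOMPRESSED_TELEPORT, SMALL_PREP)
large_uncompressed = relative_overhead(UNCOMPRESSED_TELEPORT, LARGE_PREP)
budget_for_5_percent = max_teleport_volume(0.05, LARGE_PREP)
```

Under this ratio model the uncompressed structure with the smaller preparation volume reproduces the worst-case 25% figure. With the larger preparation volume the uncompressed overhead is roughly 8%, so dropping below 5% would require compressing the teleportation structure to a volume of about 30 or less.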

In the work of [15], we estimated that d = 34 was 
required to factor a 2000 bit integer using a particu- 
lar superconducting architecture with measurement time 
100 ns and total error detection cycle time 200 ns. Using
this hardware, the canonical implementation of T in
fig. 3 would take 76.5 µs versus just 100 ns (assuming fast
classical processing and signalling) using the techniques
summarized in fig. 8. This is an improvement of nearly
3 orders of magnitude. 
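
The arithmetic behind this comparison can be reproduced as follows. This is a sketch using only the numbers given in the text (time depth 45d/4 rounds, d = 34, 200 ns per round, 100 ns per measurement); the function names are illustrative.

```python
import math


def canonical_t_time_ns(distance, cycle_time_ns):
    """Time of the canonical T gate: 45d/4 rounds of error detection."""
    rounds = 45 * distance / 4
    return rounds * cycle_time_ns


def time_optimal_runtime_ns(t_depth, measurement_time_ns):
    """Asymptotic runtime of a T depth n computation: n * t_m."""
    return t_depth * measurement_time_ns


canonical = canonical_t_time_ns(distance=34, cycle_time_ns=200)
optimal = time_optimal_runtime_ns(t_depth=1, measurement_time_ns=100)
speedup = canonical / optimal
orders_of_magnitude = math.log10(speedup)
```

Here `canonical` evaluates to 76500 ns, that is 76.5 µs, and `speedup` to 765, consistent with the quoted improvement of nearly 3 orders of magnitude.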

In summary, we have described a technique enabling 
the surface code to achieve time-optimal quantum com- 
putation with negligible additional space-time overhead. 
Algorithm execution can be performed in time equal to 
the number of layers of T gates times the physical mea- 
surement time. The qubit overhead is largely determined 
by the total number of T gates. This motivates the search 
for low T depth and count quantum circuit implementa- 
tions of algorithms. Our result is fundamentally depen- 
dent on the assumption of fast classical processing and 
signalling, which is reasonable given the required classical
processing of the surface code can be performed in
O(1) time using a uniform array of processors on average
communicating only with their nearest neighbors
[29]. Nevertheless, our desire to assume that this classical
processing occurs in a time negligible compared to
the quantum measurement time motivates further study 
of the classical processing, including implementation in 
hardware. Long-term, implementation in high-speed, low 
power dissipation, single flux quantum superconducting 
technology is an attractive option [32, 33].